Abstract


Introduce the ability to have several code sections in EOF-formatted (EIP-3540) 
bytecode, each one representing a separate subroutine/function. Two new 
opcodes, CALLF and RETF, are introduced to call and return from such a function.
Dynamic jump instructions are disallowed. 


Motivation 


Currently, in the EVM everything is a dynamic jump. Languages like Solidity generate
most jumps in a static manner (i.e. the destination is pushed to the stack right before,
PUSHn .. JUMP). Unfortunately, most EVM interpreters cannot take advantage of this,
because of the added requirement of validation/analysis. This also restricts
them from making optimisations and potentially reducing the cost of jumps.


EIP-4200 introduces static jump instructions, which remove the need for most 
dynamic jump use cases, but not everything can be solved with them. 


This EIP aims to remove the need for dynamic jumps and to disallow them, as it offers
the most important feature they are used for: calling into and returning from functions.


Furthermore, it aims to improve analysis opportunities by encoding the number of 
inputs and outputs for each given function, and isolating the stack of each function 
(i.e. a function cannot read the stack of the caller/callee). 


Specification 
Type Section 


The type section of EOF containers must adhere to the following requirements:


1. The section is comprised of a list of metadata where the metadata index in 
the type section corresponds to a code section index. Therefore, the type 
section size MUST be n * 4 bytes, where n is the number of code sections. 

2. Each metadata item has 3 attributes: a uint8 inputs, a uint8 outputs, and a
uint16 max_stack_height. Note: This implies that there is a limit of 255 stack
items for the input and for the output. This is further restricted to 127 stack items,
because the upper bit of both the input and output bytes is reserved for
future use. max_stack_height is further defined in EIP-5450.

3. The first code section MUST have 0 inputs and 0 outputs. 
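The three requirements above can be sketched as a small parser. This is only an illustration, not part of the specification: the names TypeEntry and parse_type_section are hypothetical, and the big-endian byte order of max_stack_height is an assumption (the text above states only that it is a uint16).

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeEntry:
    inputs: int
    outputs: int
    max_stack_height: int


def parse_type_section(data: bytes, num_code_sections: int) -> list[TypeEntry]:
    """Split the type section into one 4-byte entry per code section."""
    # Requirement 1: exactly 4 bytes of metadata per code section.
    if len(data) != num_code_sections * 4:
        raise ValueError("type section size must be n * 4 bytes")
    entries = []
    for i in range(num_code_sections):
        # Assumption: max_stack_height is stored big-endian.
        inputs, outputs, max_stack_height = struct.unpack_from(">BBH", data, i * 4)
        # Requirement 2: the upper bit of inputs/outputs is reserved.
        if inputs > 127 or outputs > 127:
            raise ValueError("inputs and outputs are limited to 127")
        entries.append(TypeEntry(inputs, outputs, max_stack_height))
    # Requirement 3: the first code section takes and returns nothing.
    if entries and (entries[0].inputs != 0 or entries[0].outputs != 0):
        raise ValueError("first code section must have 0 inputs and 0 outputs")
    return entries
```

For example, parse_type_section(bytes([0, 0, 0, 2, 2, 1, 0, 5]), 2) describes two sections, the second taking 2 inputs and returning 1 output.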


Refer to EIP-3540 to see the full structure of a well-formed EOF bytecode. 


New execution state in EVM 


A return stack is introduced, separate from the operand stack. It is a stack of items
representing execution state to return to after function execution is finished. Each
item is comprised of: code section index, offset in the code section (PC value), calling
function stack height.


Note: Implementations are free to choose a particular encoding for a stack item. In the
specification below we assume that the representation is three unsigned integers:
code_section_index, offset, stack_height.


The return stack is limited to a maximum of 1024 items.


Additionally, the EVM keeps track of the index of the currently executing section,
current_section_index.


New instructions 


We introduce two new instructions: 


1. CALLF (0xb0) - call a function
2. RETF (0xb1) - return from a function


If the code is legacy bytecode, any of these instructions results in an exceptional halt. 
(Note: This means no change to behaviour) 


First we define several helper values: 


- caller_stack_height = return_stack.top().stack_height - stack height value
saved in the top item of return stack

- type[i].inputs = type_section_contents[i * 4] - number of inputs of ith
section

- type[i].outputs = type_section_contents[i * 4 + 1] - number of outputs of
ith section


If the code is valid EOF1, the following execution rules apply: 


CALLF 


1. Has one immediate argument, code_section_index , encoded as a 16-bit 
unsigned big-endian value. 


2. EOF validation guarantees that operand stack has at least
caller_stack_height + type[code_section_index].inputs items.

3. If operand stack size exceeds 1024 - 
type[code_section_index].max_stack_height (i.e. if the called function may 
exceed the global stack height limit), execution results in exceptional halt. 
This also guarantees that the stack height after the call is within the limits. 

4. If return stack already has 1024 items, execution results in exceptional halt. 

5. Charges 5 gas. 

6. Pops nothing and pushes nothing to operand stack. 

7. Pushes to return stack an item:


(code_section_index = current_section_index,
offset = PC_post_instruction,
stack_height = operand_stack.height - type[code_section_index].inputs)


By PC_post_instruction we mean the PC position after the entire
immediate argument of CALLF. Operand stack height is saved as it was
before function inputs were pushed.


Note: Code validation rules of EIP-5450 guarantee there is always an
instruction following CALLF (since a terminating instruction or unconditional
jump is required to be the final one in the section), therefore
PC_post_instruction always points to an instruction inside section bounds.

8. Sets current_section_index to code_section_index and PC to 0, and
execution continues in the called section.


RETF 


1. Does not have immediate arguments. 

2. EOF validation guarantees that operand stack has exactly caller_stack_height 
+ type[current_section_index].outputs items. 

3. Charges 3 gas. 

4. Pops nothing and pushes nothing to operand stack. 

5. Pops an item from return stack and sets current_section_index and PC to
values from this item.

1. If return stack is empty after this, execution halts with success. 
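The CALLF and RETF rules above can be sketched as two small state transitions. This is an illustrative model under stated assumptions, not a reference implementation: Machine, ReturnFrame, callf and retf are hypothetical names, types holds (inputs, outputs, max_stack_height) tuples, gas is only accumulated (out-of-gas handling is omitted), and the code is assumed to have already passed EOF validation.

```python
from dataclasses import dataclass, field

STACK_LIMIT = 1024
RETURN_STACK_LIMIT = 1024


class ExceptionalHalt(Exception):
    pass


@dataclass
class ReturnFrame:
    code_section_index: int
    offset: int
    stack_height: int


@dataclass
class Machine:
    types: list[tuple[int, int, int]]  # (inputs, outputs, max_stack_height)
    operand_stack: list[int] = field(default_factory=list)
    # The return stack starts with one item for the top frame.
    return_stack: list[ReturnFrame] = field(
        default_factory=lambda: [ReturnFrame(0, 0, 0)]
    )
    current_section_index: int = 0
    pc: int = 0
    gas_used: int = 0
    halted: bool = False


def callf(m: Machine, code: bytes) -> None:
    """Execute CALLF located at m.pc in the current section's code."""
    # 16-bit unsigned big-endian immediate following the opcode.
    target = int.from_bytes(code[m.pc + 1 : m.pc + 3], "big")
    inputs, _outputs, max_stack_height = m.types[target]
    if len(m.operand_stack) > STACK_LIMIT - max_stack_height:
        raise ExceptionalHalt("called function may exceed the stack limit")
    if len(m.return_stack) >= RETURN_STACK_LIMIT:
        raise ExceptionalHalt("return stack overflow")
    m.gas_used += 5
    # Save where to resume and the caller's stack height without the inputs.
    m.return_stack.append(
        ReturnFrame(m.current_section_index, m.pc + 3, len(m.operand_stack) - inputs)
    )
    m.current_section_index = target
    m.pc = 0


def retf(m: Machine) -> None:
    """Execute RETF: resume the caller, or halt successfully in the top frame."""
    m.gas_used += 3
    frame = m.return_stack.pop()
    m.current_section_index = frame.code_section_index
    m.pc = frame.offset
    if not m.return_stack:
        m.halted = True
```

Note that neither function touches the operand stack: inputs and outputs are simply left in place for the callee and the caller respectively.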


Code Validation 


In addition to container format validation rules above, we extend code section 
validation rules (as defined in EIP-3670). 


1. Code validation rules of EIP-3670 are applied to every code section. 
2. Code section is invalid in case an immediate argument of any CALLF is 
greater than or equal to the total number of code sections. 
3. RJUMP , RJUMPI and RJUMPV immediate argument value (jump destination 
relative offset) validation: 
1. Code section is invalid in case offset points to a position outside of 
section bounds. 
2. Code section is invalid in case offset points to one of two bytes 
directly following CALLF instruction. 
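A minimal sketch of rules 2 and 3 follows. It is a simplification under stated assumptions: the function names are hypothetical, the CALLF opcode value is taken from the instruction list above, and by default only PUSH1..PUSH32 and CALLF are treated as carrying immediates. A real validator would pass a complete immediate_sizes table (including the relative jump instructions) and would apply the remaining EIP-3670 checks.

```python
CALLF = 0xB0


def default_immediate_sizes() -> dict[int, int]:
    """Immediate sizes for PUSH1..PUSH32 (0x60..0x7f) and CALLF only."""
    sizes = {0x60 + n: n + 1 for n in range(32)}
    sizes[CALLF] = 2
    return sizes


def scan_callf(code, num_code_sections, immediate_sizes=None) -> set[int]:
    """Check every CALLF immediate; return positions of CALLF immediate bytes."""
    sizes = default_immediate_sizes() if immediate_sizes is None else immediate_sizes
    callf_immediates = set()
    pos = 0
    while pos < len(code):
        op = code[pos]
        size = sizes.get(op, 0)
        if pos + 1 + size > len(code):
            raise ValueError("truncated immediate")
        if op == CALLF:
            target = int.from_bytes(code[pos + 1 : pos + 3], "big")
            # Rule 2: the target must be an existing code section.
            if target >= num_code_sections:
                raise ValueError("CALLF targets a non-existent code section")
            callf_immediates.update({pos + 1, pos + 2})
        pos += 1 + size  # skip the opcode and its immediate data
    return callf_immediates


def is_valid_jump_target(target, code_len, callf_immediates) -> bool:
    """Rule 3: inside section bounds and not inside a CALLF immediate."""
    return 0 <= target < code_len and target not in callf_immediates
```

Here target is the absolute position in the section computed from the relative offset of the jump instruction being validated.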


Disallowed instructions 


Dynamic jump instructions JUMP (0x56) and JUMPI (0x57) are invalid and their
opcodes are undefined.


JUMPDEST (0x5b) instruction is renamed to NOP ("no operation") without the change
in behaviour: it pops nothing and pushes nothing to operand stack and has no other
effects except for PC increment and charging 1 gas.


PC (0x58) instruction becomes invalid and its opcode is undefined.


Note: This change implies that JUMPDEST analysis is no longer required for EOF 
code. 


Execution 


1. Execution starts at the first byte of the first code section, and PC is set to 0.

2. Return stack is initialized to contain one item: (code_section_index = 0,
offset = 0, stack_height = 0)

3. If any instruction accesses an operand stack item below caller_stack_height,
execution results in exceptional halt. This rule replaces the old stack
underflow check.

4. No change in stack overflow check: if any instruction causes the operand
stack height to exceed 1024, execution results in exceptional halt.
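The two stack checks can be expressed as one helper that an interpreter loop might call before executing each instruction. This is a sketch only: check_operand_stack and its parameters are hypothetical, with items_accessed standing for how deep the instruction reaches into the stack and height_change for its net effect on the stack height.

```python
STACK_LIMIT = 1024


class ExceptionalHalt(Exception):
    pass


def check_operand_stack(height, caller_stack_height, items_accessed, height_change):
    """Return the new stack height, or raise if a stack rule is violated."""
    # Underflow: the instruction may not reach below the caller's stack height.
    if height - items_accessed < caller_stack_height:
        raise ExceptionalHalt("stack underflow")
    # Overflow: unchanged global limit on the operand stack height.
    new_height = height + height_change
    if new_height > STACK_LIMIT:
        raise ExceptionalHalt("stack overflow")
    return new_height
```

In the top frame caller_stack_height is 0, so the first check reduces to the familiar underflow check of a single code section.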




Rationale 


RETF in the top frame ends execution vs exceptionally halts


Alternative logic for executing RETF in the top frame could be to exceptionally halt 
execution, because there is arguably no caller for the starting function. This would 
mean that return stack is initialized as empty, and RETF exceptionally aborts when 
return stack is empty. 


We have decided in favor of always having at least one item in the return stack,
because it avoids a special case for an empty stack in the interpreter
loop stack underflow check. We keep the stack underflow rule general by having
caller_stack_height = 0 in the top frame.


Code section limit and instruction size 


The number of code sections is limited to 1024. This requires a 2-byte immediate for
CALLF and leaves room for increasing the limit in the future. A limit of 256 (1-byte
immediate) was discussed, and concerns were raised that it might not be sufficient.


NOP instruction


Instead of deprecating JUMPDEST we repurpose it as the NOP instruction, because
JUMPDEST effectively was a "no-operation" instruction and was already used as such in
various contexts. It can be useful for some off-chain tooling, e.g. benchmarking EVM
implementations (performance of the NOP instruction is performance of the EVM interpreter
loop), as padding to force code alignment, and as a placeholder in dynamic code
composition.


Deprecating JUMPDEST analysis


The purpose of JUMPDEST analysis was to find in code the valid JUMPDEST bytes that
do not happen to be inside PUSH immediate data. Only dynamic jump instructions
(JUMP, JUMPI) required the destination to be a JUMPDEST instruction. Relative static jumps
(RJUMP and RJUMPI) do not have this requirement and are validated once at deploy-time
EOF instruction validation. Therefore, without dynamic jump instructions,
JUMPDEST analysis is not required.


Backwards Compatibility 


This change poses no risk to backwards compatibility, as it is introduced only for
EOF1 contracts, for which deploying undefined instructions is not allowed; therefore,
there are no existing contracts using these instructions. The new instructions are not
introduced for legacy bytecode (code which is not EOF formatted).


The new execution state and multi-section control flow pose no risk to backwards
compatibility, because they are a generalization of executing a single code section.
Executing existing contracts (both legacy and EOF1) has no user-observable changes.


Security Considerations 
TBA 